Deductive Parsing for Interaction Grammars (Analyse déductive pour les grammaires d'interaction)
We propose a parsing algorithm for Interaction Grammars using the deductive parsing framework. This approach brings a new perspective on the problem: it departs from previous methods, which rely on constraint-solving techniques, and interprets parsing as a graph-rewriting problem. Furthermore, this presentation allows a standard description of the algorithm and a fine-grained inspection of the sources of non-determinism.
Connected Choice and the Brouwer Fixed Point Theorem
We study the computational content of the Brouwer Fixed Point Theorem in the
Weihrauch lattice. Connected choice is the operation that finds a point in a
non-empty connected closed set given by negative information. One of our main
results is that for any fixed dimension the Brouwer Fixed Point Theorem of that
dimension is computably equivalent to connected choice of the Euclidean unit
cube of the same dimension. Another main result is that connected choice is
complete for dimension greater than or equal to two in the sense that it is
computably equivalent to Weak Kőnig's Lemma. While we can present two
independent proofs for dimension three and upwards that are either based on a
simple geometric construction or a combinatorial argument, the proof for
dimension two is based on a more involved inverse limit construction. The
connected choice operation in dimension one is known to be equivalent to the
Intermediate Value Theorem; we prove that this problem is not idempotent in
contrast to the case of dimension two and upwards. We also prove that Lipschitz
continuity with Lipschitz constants strictly larger than one does not simplify
finding fixed points. Finally, we prove that finding a connectedness component
of a closed subset of the Euclidean unit cube of any dimension greater than or
equal to one is equivalent to Weak Kőnig's Lemma. In order to describe these
results, we introduce a representation of closed subsets of the unit cube by
trees of rational complexes. Comment: 36 pages
Feature Unification in TAG Derivation Trees
The derivation trees of a tree adjoining grammar provide a first insight into
the sentence semantics, and are thus prime targets for generation systems. We
define a formalism, feature-based regular tree grammars, and a translation from
feature-based tree adjoining grammars into this new formalism. The translation
preserves the derivation structures of the original grammar, and accounts for
feature unification. Comment: 12 pages, 4 figures. In TAG+9, Ninth International Workshop on Tree
Adjoining Grammars and Related Formalisms, 200
XMG: An Extensible Metagrammar Compiler (XMG : Un Compilateur de Méta-Grammaires Extensible)
In this article, we present a tool for automatically producing linguistic resources, namely grammars. The tool is characterized by its extensibility, both in the grammatical formalisms it supports (currently tree adjoining grammars and interaction grammars) and in its modular architecture, which makes it easy to integrate new modules that check the validity of the structures produced. In addition, the tool offers support suited to developing grammars with a semantic dimension.
DCU-Paris13 systems for the SANCL 2012 shared task
The DCU-Paris13 team submitted three systems to the SANCL 2012 shared task on parsing English web text. The first submission, the highest ranked constituency parsing system, uses a combination of PCFG-LA product grammar parsing and self-training. In the second submission, also a constituency parsing system, the n-best lists of various parsing models are combined using an approximate sentence-level product model. The third system, the highest ranked system in the dependency parsing track, uses voting over dependency arcs to combine the output of three constituency parsing systems which have been converted to dependency trees. All systems make use of a data-normalisation component, a parser accuracy predictor and a genre classifier.
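The abstract does not spell out how the arc voting works, so the following is only a minimal sketch of one common scheme, per-token majority voting. The function name `vote_arcs`, the `(head, label)` encoding and the tie-breaking rule are all assumptions made for illustration, not the team's actual implementation.

```python
from collections import Counter


def vote_arcs(parses):
    """Combine several dependency parses of one sentence by per-token voting.

    Hypothetical encoding: each parse is a list with one (head, label) pair
    per token, where heads are 1-based token indices and 0 marks the root.
    """
    n = len(parses[0])
    if any(len(p) != n for p in parses):
        raise ValueError("all parses must cover the same tokens")

    voted = []
    for i in range(n):
        # Count how many parsers proposed each (head, label) arc for token i.
        counts = Counter(p[i] for p in parses)
        # On a tie, Counter.most_common keeps first-seen order,
        # so the earliest parser in the list wins.
        voted.append(counts.most_common(1)[0][0])
    return voted


parses = [
    [(2, "nsubj"), (0, "root"), (2, "obj")],
    [(2, "nsubj"), (0, "root"), (1, "obj")],
    [(3, "nsubj"), (0, "root"), (2, "obj")],
]
combined = vote_arcs(parses)
```

Voting token by token does not guarantee that the result is a well-formed tree; a real system would typically need an extra step (for example, decoding a spanning tree over the vote counts) to enforce that.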
Combining PCFG-LA models with dual decomposition: a case study with function labels and binarization
It has recently been shown that different NLP models can be effectively combined using dual decomposition. In this paper we demonstrate that PCFG-LA parsing models are suitable for combination in this way. We experiment with the different models which result from alternative methods of extracting a grammar from a treebank (retaining or discarding function labels, left binarization versus right binarization) and achieve a labeled Parseval F-score of 92.4 on Wall Street Journal Section 23; this represents an absolute improvement of 0.7 and an error reduction rate of 7% over a strong PCFG-LA product-model baseline. Although we experiment only with binarization and function labels in this study, there is much scope for applying this approach to other grammar extraction strategies.
Statistical Parsing of Spanish and Data Driven Lemmatization
Although parsing performance has greatly improved in recent years, grammar inference from treebanks for morphologically rich languages, especially from small treebanks, is still a challenging task. In this paper we investigate how state-of-the-art parsing performance can be achieved on Spanish, a language with a rich verbal morphology, with a non-lexicalized parser trained on a treebank containing only around 2,800 trees. We rely on accurate part-of-speech tagging and data-driven lemmatization in order to cope with lexical data sparseness. Our methodology provides state-of-the-art results on Spanish and is applicable to other languages.
Handling unknown words in statistical latent-variable parsing models for Arabic, English and French
This paper presents a study of the impact of using simple and complex morphological clues to improve the classification of rare and unknown words for parsing. We compare this approach to a language-independent technique often used in parsers which is based solely on word frequencies. This study is applied to three languages that exhibit different levels of morphological expressiveness: Arabic, French and English. We integrate information about Arabic affixes and morphotactics into a PCFG-LA parser and obtain state-of-the-art accuracy. We also show that these morphological clues can be learnt automatically from an annotated corpus.
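The frequency-based baseline mentioned above is not described in detail in the abstract. A minimal sketch of one widespread variant is to replace every training word seen fewer than a threshold number of times with a single unknown-word token. The function name `replace_rare_words`, the threshold `min_count` and the token `<UNK>` are assumptions for illustration only.

```python
from collections import Counter


def replace_rare_words(sentences, min_count=2, unk="<UNK>"):
    """Map words seen fewer than min_count times to a single unknown token.

    sentences: list of tokenized sentences (lists of strings).
    The threshold and the token name are illustrative choices.
    """
    # Count word occurrences over the whole training corpus.
    counts = Counter(word for sentence in sentences for word in sentence)
    return [
        [word if counts[word] >= min_count else unk for word in sentence]
        for sentence in sentences
    ]


corpus = [["the", "cat", "sleeps"], ["the", "dog", "sleeps"]]
normalized = replace_rare_words(corpus)
```

A morphology-aware approach, as studied in the paper, would instead map rare words to several classes based on clues such as affixes rather than to a single token.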
Semi-supervised Dependency Parsing using Lexical Affinities
Treebanks are not large enough to reliably model precise lexical phenomena. This deficiency causes attachment errors in parsers trained on such data. In this paper we propose to compute lexical affinities on large corpora for specific lexico-syntactic configurations that are hard to disambiguate, and to introduce this new information into a parser. Experiments on the French Treebank showed a relative decrease of 7.1% in the error rate measured by Labeled Accuracy Score, yielding the best parsing results on this treebank.